Reinforcement learning, Sequential Monte Carlo and the EM algorithm
Similar resources
Implementations of the Monte Carlo EM Algorithm
The Monte Carlo EM (MCEM) algorithm is a modification of the EM algorithm where the expectation in the E-step is computed numerically through Monte Carlo simulations. The most flexible and generally applicable approach to obtaining a Monte Carlo sample in each iteration of an MCEM algorithm is through Markov chain Monte Carlo (MCMC) routines such as the Gibbs and Metropolis–Hastings samplers. A...
Monte Carlo Bayesian Reinforcement Learning
Bayesian reinforcement learning (BRL) encodes prior knowledge of the world in a model and represents uncertainty in model parameters by maintaining a probability distribution over them. This paper presents Monte Carlo BRL (MC-BRL), a simple and general approach to BRL. MC-BRL samples a priori a finite set of hypotheses for the model parameter values and forms a discrete partially observable Mar...
Monte Carlo Matrix Inversion and Reinforcement Learning
We describe the relationship between certain reinforcement learning (RL) methods based on dynamic programming (DP) and a class of unorthodox Monte Carlo methods for solving systems of linear equations proposed in the 1950s. These methods recast the solution of the linear system as the expected value of a statistic suitably defined over sample paths of a Markov chain. The significance of our ob...
Monte Carlo Bayesian Hierarchical Reinforcement Learning
In this paper, we propose to use hierarchical action decomposition to make Bayesian model-based reinforcement learning more efficient and feasible in practice. We formulate Bayesian hierarchical reinforcement learning as a partially observable semi-Markov decision process (POSMDP). The main POSMDP task is partitioned into a hierarchy of POSMDP subtasks; lower-level subtasks get solved first, th...
Inferring the Structure of Populations of Neurons using a Sequential Monte Carlo EM Algorithm
A fundamental goal of neuroscience is to be able to construct models of a population of neurons acting in concert to perform nonlinear operations. A primary difficulty hindering progress towards this goal is the paucity of computational tools designed with population data in mind. Cross-correlation based ideas become computationally intractable due to the combinatorial explosion. Fitting phenom...
Journal
Journal title: Sādhanā
Year: 2018
ISSN: 0256-2499, 0973-7677
DOI: 10.1007/s12046-018-0889-8